Identifying Collocations to Measure Compositionality: Shared Task System Description
Abstract
This paper describes three systems from the University of Minnesota, Duluth that participated in the DiSCo 2011 shared task, which evaluated distributional methods of measuring semantic compositionality. All three systems approached this as a problem of collocation identification, where strong collocates are assumed to be minimally compositional. duluth-1 relies on the t-score, whereas duluth-2 and duluth-3 rely on Pointwise Mutual Information (PMI). duluth-1 was the top-ranked system overall in coarse-grained scoring, a 3-way category assignment in which pairs were assigned values of high, medium, or low compositionality.
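To make the two association measures named in the abstract concrete, here is a minimal sketch of how a t-score and PMI can be computed from bigram counts. It uses the common textbook forms (observed versus expected joint count); the abstract does not give the exact formulas, corpus, or thresholds the Duluth systems used, so the function names, the count layout, and the toy counts below are all illustrative assumptions.

```python
import math


def expected_count(n_x, n_y, n_total):
    """Joint count expected if the two words occurred independently."""
    return n_x * n_y / n_total


def pmi(n_xy, n_x, n_y, n_total):
    """Pointwise Mutual Information: log2(observed / expected)."""
    return math.log2(n_xy / expected_count(n_x, n_y, n_total))


def t_score(n_xy, n_x, n_y, n_total):
    """t-score in its common approximation: (observed - expected) / sqrt(observed)."""
    return (n_xy - expected_count(n_x, n_y, n_total)) / math.sqrt(n_xy)


def rank_by_association(counts, n_total, measure):
    """Order word pairs from strongest to weakest association.

    counts maps (w1, w2) -> (n_xy, n_x, n_y). Under the abstract's
    assumption, pairs near the top would be the least compositional.
    """
    scored = {
        pair: measure(n_xy, n_x, n_y, n_total)
        for pair, (n_xy, n_x, n_y) in counts.items()
    }
    return sorted(scored, key=scored.get, reverse=True)


# Made-up counts, for illustration only (not taken from the shared task data).
TOY_TOTAL = 100_000
TOY_COUNTS = {
    ("red", "tape"): (30, 200, 150),
    ("red", "car"): (5, 200, 1000),
}

if __name__ == "__main__":
    for name, measure in (("pmi", pmi), ("t-score", t_score)):
        print(name, rank_by_association(TOY_COUNTS, TOY_TOTAL, measure))
```

With these toy counts both measures place ("red", "tape") above ("red", "car"), which matches the intuition that a strong collocate is treated as less compositional. Turning such scores into high, medium, or low categories would need cut-off values that the abstract does not specify.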
Similar resources
Measuring the Compositionality of Collocations via Word Co-occurrence Vectors: Shared Task System Description
A description of a system for measuring the compositionality of collocations within the framework of the shared task of the Distributional Semantics and Compositionality workshop (DiSCo 2011) is presented. The system exploits the intuition that a highly compositional collocation would tend to have a considerable semantic overlap with its constituents (headword and modifier) whereas a collocatio...
Shared Task System Description: Frustratingly Hard Compositionality Prediction
We considered a wide range of features for the DiSCo 2011 shared task about compositionality prediction for word pairs, including COALS-based endocentricity scores, compositionality scores based on distributional clusters, statistics about wordnet-induced paraphrases, hyphenation, and the likelihood of long translation equivalents in other languages. Many of the features we considered correlate...
Shared Task System Description: Measuring the Compositionality of Bigrams using Statistical Methodologies
The measurement of the relative compositionality of bigrams is crucial for identifying Multi-word Expressions (MWEs) in Natural Language Processing (NLP) tasks. The article presents the experiments carried out as part of the participation in the shared task ‘Distributional Semantics and Compositionality (DiSCo)’ organized as part of the DiSCo workshop at ACL-HLT 2011. The experiments deal with various c...
Relative Compositionality of Multi-word Expressions: A Study of Verb-Noun (V-N) Collocations
Recognition of Multi-word Expressions (MWEs) and their relative compositionality are crucial to Natural Language Processing. Various statistical techniques have been proposed to recognize MWEs. In this paper, we integrate all the existing statistical features and investigate a range of classifiers for their suitability for recognizing the non-compositional Verb-Noun (V-N) collocations. In the t...
Distributional Semantics and Compositionality 2011: Shared Task Description and Results
This paper gives an overview of the shared task at the ACL-HLT 2011 DiSCo (Distributional Semantics and Compositionality) workshop. We describe in detail the motivation for the shared task, the acquisition of datasets, the evaluation methodology and the results of participating systems. The task of assigning a numerical score for a phrase according to its compositionality showed to be hard. Man...